Abstract. Access to the cloud has the potential to provide scalable and
cost-effective enhancements of physical devices through the use of
advanced computational processes run on apparently limitless cyber
infrastructure. On the other hand, cyber-physical systems and
cloud-controlled devices are subject to numerous design challenges; among
them is that of security. In particular, recent advances in adversary
technology pose Advanced Persistent Threats (APTs) which may stealthily
and completely compromise a cyber system. In this paper, we design a
framework for the security of cloud-based systems that specifies when a
device should trust commands from the cloud which may be compromised. This
interaction can be considered as a game between three players: a cloud
defender/administrator, an attacker, and a device. We use traditional
signaling games to model the interaction between the cloud and the device,
and we use the recently proposed Flipit game to model the struggle between
the defender and attacker for control of the cloud. Because attacks upon
the cloud can occur without knowledge of the defender, we assume that
strategies in both games are picked according to prior commitment. This
framework requires a new equilibrium concept, which we call Gestalt
Equilibrium, a fixed-point that expresses the interdependence of the
signaling and Flipit games. We present the solution to this fixed-point
problem under certain parameter cases, and illustrate an example
application of cloud control of an unmanned vehicle. Our results
contribute to the growing understanding of cloud-controlled systems.


1 Introduction 

Advances in computation and information analysis have expanded the
capabilities of the physical plants and devices in cyber-physical systems
(CPS) [iIT^. Fostered by advances in cloud computing, CPS have garnered
significant attention from both industry and academia. Access to the cloud
gives administrators the opportunity to build virtual machines that provide
computational resources with precision, scalability, and accessibility.

Despite the advantages that cloud computing provides, it also has some
drawbacks. They include - but are not limited to - accountability,
virtualization, and security and privacy concerns. In this paper, we focus
especially on providing accurate signals to a cloud-connected device and
deciding whether to accept those signals in the face of security challenges.



Recently, system designers face security challenges in the form of Advanced
Persistent Threats (APTs) [TS]. APTs arise from sophisticated attackers who can
infer a user's cryptographic key or leverage zero-day vulnerabilities in order to
completely compromise a system without detection by the system administrator
[TH]. This type of stealthy and complete compromise has demanded new types
of models for prediction and design.

In this paper, we propose a model in which a device decides whether to trust
commands from a cloud which is vulnerable to APTs and may fall under
adversarial control. We synthesize a mathematical framework that enables
devices controlled by the cloud to intelligently decide whether to obey
commands from the possibly-compromised cloud or to rely on their own
lower-level control.

We model the cyber layer of the cloud-based system using the recently
proposed Flipit game m^. This game is especially suited for studying systems
under APTs. We model the interaction between the cloud and the connected
device using a signaling game, which provides a framework for modeling dynamic
interactions in which one player operates based on a belief about the private
information of the other. A significant body of research has utilized this
framework for security [7,9,15,21,8]. The signaling and Flipit games are
coupled, because the outcome of the Flipit game determines the likelihood of
benign and malicious cloud types in the robotic signaling game. Because the
attacker is able to compromise the cloud without detection by the defender, we
consider the strategies of the attacker and defender to be chosen with prior
commitment. The circular dependence in our game requires a new equilibrium
concept which we call a Gestalt equilibrium^1. We specify the parameter cases
under which the Gestalt equilibrium varies, and solve a case study of the game
to give an idea of how the Gestalt equilibrium can be found in general. Our
proposed framework has versatile applications to different cloud-connected
systems such as urban traffic control, drone delivery, design of smart homes,
etc. We study one particular application in this paper: control of an unmanned
vehicle under the threat of a compromised cloud.

Our contributions are summarized as follows: 


i) We model the interaction of the attacker, defender/cloud administrator, and
cloud-connected device by introducing a novel game consisting of two coupled
games: a traditional signaling game and the recently proposed Flipit game.

ii) We provide a general framework by which a device connected to a cloud can
decide whether to follow its own limited control ability or to trust the signal
of a possibly-malicious cloud.

iii) We propose a new equilibrium definition for this combined game: Gestalt
equilibrium, which involves a fixed-point in the mappings between the two
component games.

iv) Finally, we apply our framework to the problem of unmanned vehicle control.

In the sections that follow, we first outline the system model, then describe
the equilibrium concept. Next, we use this concept to find the equilibria of the
game under selected parameter regimes. Finally, we apply our results to the
control of an unmanned vehicle. In each of these sections, we first consider the
signaling game, then consider the Flipit game, and last discuss the synthesis
of the two games. We then conclude the paper and suggest areas for future
research.

^1 Gestalt is a noun which means something that is composed of multiple parts and
yet is different from the combination of the parts [2].


2 System Model 

We model a cloud-based system in which a cloud is subject to APTs. In this
model, an attacker, denoted by $\mathcal{A}$, capable of APTs can pay an attack
cost to completely compromise the cloud without knowledge of the cloud defender.
The defender, or cloud administrator, denoted by $\mathcal{D}$, does not observe
these attacks, but has the capability to pay a cost to reclaim control of the
cloud. The cloud transmits a message to a robot or other device, denoted by
$\mathcal{R}$. The device may follow this command, but it is also equipped with
an on-board control system for autonomous operation. It may elect to use its
autonomous operation system rather than obey commands from the cloud.

This scenario involves two games: the Flipit game introduced in m. and the 
well-known signaling game. The Flipit game takes place between the attacker
and cloud defender, while the signaling game takes place between the
possibly-compromised cloud and the device. For brevity, denote the Flipit game
by $G_F$, the signaling game by $G_S$, and the combined game - call it
CloudControl - by $G_{CC}$, as shown in Fig. 1. In the next subsections, we
formalize this game model.

2.1 Cloud-Device Signaling Game 

Let $\theta$ denote the type of the cloud. Denote compromised and safe types of
clouds by $\theta_{\mathcal{A}}$ and $\theta_{\mathcal{D}}$ in the set $\Theta$.
Denote the probabilities that $\theta = \theta_{\mathcal{A}}$ and that
$\theta = \theta_{\mathcal{D}}$ by $p$ and $1 - p$. Signaling games typically
give these probabilities a priori, but in CloudControl they are determined by
the equilibrium of the Flipit game $G_F$.

Let $m_H$ and $m_L$ denote messages of high and low risk, respectively, and
let $m \in M = \{m_H, m_L\}$ represent a message in general. After $\mathcal{R}$
receives the message, it chooses an action, $a \in A = \{a_T, a_N\}$, where
$a_T$ represents trusting the cloud and $a_N$ represents not trusting the cloud.

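For the device $\mathcal{R}$, let $u^S_{\mathcal{R}} : \Theta \times M \times A \to U_{\mathcal{R}}$,
where $U_{\mathcal{R}} \subset \mathbb{R}$, be a utility function such that
$u^S_{\mathcal{R}}(\theta, m, a)$ gives the device's utility when the type is
$\theta$, the message is $m$, and the action is $a$. Let
$u^S_{\mathcal{A}} : M \times A \to U_{\mathcal{A}} \subset \mathbb{R}$ and
$u^S_{\mathcal{D}} : M \times A \to U_{\mathcal{D}} \subset \mathbb{R}$ be
utility functions for the attacker and defender. Note that these players only
receive utility in $G_S$ if their own type controls the cloud in $G_F$, so that
type is no longer a necessary argument for $u^S_{\mathcal{A}}$ and
$u^S_{\mathcal{D}}$.

Denote the strategy of $\mathcal{R}$ by $\sigma^S_{\mathcal{R}} : A \to [0,1]$,
such that $\sigma^S_{\mathcal{R}}(a \mid m)$ gives the mixed-strategy
probability that $\mathcal{R}$ plays action $a$ when the message is $m$. The
role of the sender may be played by $\mathcal{A}$ or $\mathcal{D}$ depending on
the state of the cloud, determined by $G_F$. Let
$\sigma^S_{\mathcal{A}} : M \to [0,1]$ denote the strategy that $\mathcal{A}$
plays when she controls the cloud, so that $\sigma^S_{\mathcal{A}}(m)$ gives the
probability that $\mathcal{A}$ sends message $m$. (The superscript $S$ specifies
that this strategy concerns the signaling game.) Similarly, let
$\sigma^S_{\mathcal{D}} : M \to [0,1]$ denote the strategy played by
$\mathcal{D}$ when he controls the cloud. Then $\sigma^S_{\mathcal{D}}(m)$ gives
the probability that $\mathcal{D}$ sends message $m$. Let
$\Gamma^S_{\mathcal{R}}$, $\Gamma^S_{\mathcal{A}}$, and $\Gamma^S_{\mathcal{D}}$
denote the sets of mixed strategies for each player.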

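For $X \in \{\mathcal{D}, \mathcal{A}\}$, define functions
$\bar{u}^S_X : \Gamma^S_{\mathcal{R}} \times \Gamma^S_X \to U_X$, such that
$\bar{u}^S_X(\sigma^S_{\mathcal{R}}, \sigma^S_X)$ gives the expected utility to
sender $X$ when he or she plays mixed-strategy $\sigma^S_X$ and the receiver
plays mixed-strategy $\sigma^S_{\mathcal{R}}$. Equation (1) gives $\bar{u}^S_X$.

$$ \bar{u}^S_X(\sigma^S_{\mathcal{R}}, \sigma^S_X) = \sum_{a \in A} \sum_{m \in M} u^S_X(m, a)\, \sigma^S_{\mathcal{R}}(a \mid m)\, \sigma^S_X(m), \quad X \in \{\mathcal{A}, \mathcal{D}\} \tag{1} $$

Next, let $\mu : \Theta \to [0,1]$ represent the belief of $\mathcal{R}$, such
that $\mu(\theta \mid m)$ gives the likelihood with which $\mathcal{R}$
believes that a sender who issues message $m$ is of type $\theta$. Then define
$\bar{u}^S_{\mathcal{R}} : \Gamma^S_{\mathcal{R}} \to U_{\mathcal{R}}$ such that
$\bar{u}^S_{\mathcal{R}}(\sigma^S_{\mathcal{R}} \mid m, \mu(\cdot \mid m))$
gives the expected utility for $\mathcal{R}$ when it has belief $\mu$, the
message is $m$, and it plays strategy $\sigma^S_{\mathcal{R}}$.
$\bar{u}^S_{\mathcal{R}}$ is given by

$$ \bar{u}^S_{\mathcal{R}}(\sigma^S_{\mathcal{R}} \mid m, \mu) = \sum_{\theta \in \Theta} \sum_{a \in A} u^S_{\mathcal{R}}(\theta, m, a)\, \mu(\theta \mid m)\, \sigma^S_{\mathcal{R}}(a \mid m). \tag{2} $$

The expected utilities to the sender and receiver will determine their
incentives to control the cloud in the game $G_F$ described in the next
subsection.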
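
The two expected utilities can be evaluated directly once strategies and beliefs are stored as lookup tables. The following is a minimal sketch, not the authors' code: the dictionary layout, the function names, and the numerical values are all assumptions made for illustration. It sums sender utility over messages and actions as in Equation (1), and receiver utility over types and actions as in Equation (2).

```python
from itertools import product

MESSAGES = ("m_H", "m_L")        # high-risk and low-risk messages
ACTIONS = ("a_T", "a_N")         # trust / do not trust
TYPES = ("theta_A", "theta_D")   # attacker- or defender-controlled cloud


def sender_expected_utility(u_X, sigma_R, sigma_X):
    """Eq. (1): sum over (m, a) of u_X(m, a) * sigma_R(a | m) * sigma_X(m)."""
    return sum(
        u_X[(m, a)] * sigma_R[(a, m)] * sigma_X[m]
        for m, a in product(MESSAGES, ACTIONS)
    )


def receiver_expected_utility(u_R, mu, sigma_R, m):
    """Eq. (2): sum over (theta, a) of u_R(theta, m, a) * mu(theta | m) * sigma_R(a | m)."""
    return sum(
        u_R[(theta, m, a)] * mu[(theta, m)] * sigma_R[(a, m)]
        for theta, a in product(TYPES, ACTIONS)
    )


# Hypothetical numbers, for illustration only.
always_trust = {(a, m): float(a == "a_T") for a in ACTIONS for m in MESSAGES}
u_sender = {("m_H", "a_T"): 4, ("m_H", "a_N"): -2,
            ("m_L", "a_T"): 3, ("m_L", "a_N"): -1}
mixed_sender = {"m_H": 0.5, "m_L": 0.5}
print(sender_expected_utility(u_sender, always_trust, mixed_sender))  # 3.5
```

Keying the receiver strategy by (action, message) mirrors the conditional notation $\sigma(a \mid m)$; any other layout would serve equally well.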

2.2 Flipit Game for Cloud Control 

The basic version of Flipit [200 is played in continuous time. Assume that the 
defender controls the resource - here, the cloud - at $t = 0$. Moves for both
players obtain control of the cloud if it is under the other player's control.
In this paper, we limit our analysis to periodic strategies, in which the moves
of the attacker and the moves of the defender are both spaced equally apart,
and their phases are chosen randomly from a uniform distribution. Let
$f_{\mathcal{A}} \in \mathbb{R}_+$ and $f_{\mathcal{D}} \in \mathbb{R}_+$ (where
$\mathbb{R}_+$ represents non-negative real numbers) denote the attack and
renewal frequencies, respectively.

^2 See [50] for a more comprehensive definition of the players, time, game
state, and moves in Flipit. Here, we move on to describing aspects of our game
important for analyzing $G_{CC}$.

Fig. 1. The CloudControl game. The Flipit game models the interaction between
an attacker and a cloud administrator for control of the cloud. The outcome of this
game determines the type of the cloud in a signaling game in which the cloud conveys
commands to the robot or device. The device then decides whether to accept these
commands or rely on its own lower-level control. The Flipit and signaling games are
played concurrently.

Players benefit from controlling the cloud, and incur costs from moving. Let
$w_X(t)$ denote the average proportion of the time that player
$X \in \{\mathcal{D}, \mathcal{A}\}$ has controlled the cloud up to time $t$.
Denote the number of moves up to $t$ per unit time of player $X$ by $z_X(t)$.
Let $\alpha_{\mathcal{D}}$ and $\alpha_{\mathcal{A}}$ represent the costs of
each defender and attacker move. In the original formulation of Flipit, the
authors consider a fixed benefit for controlling the cloud. In our formulation,
the benefit depends on the equilibrium outcomes of the signaling game $G_S$.
Denote these equilibrium utilities of $\mathcal{D}$ and $\mathcal{A}$ by
$\bar{u}^{S*}_{\mathcal{D}}$ and $\bar{u}^{S*}_{\mathcal{A}}$. These give the
expected benefit of controlling the cloud. Finally, let $u^F_{\mathcal{D}}(t)$
and $u^F_{\mathcal{A}}(t)$ denote the time-averaged benefit of $\mathcal{D}$ and
$\mathcal{A}$ up to time $t$ in $G_F$. Then

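$$ u^F_X(t) = \bar{u}^{S*}_X w_X(t) - \alpha_X z_X(t), \quad X \in \{\mathcal{D}, \mathcal{A}\}, \tag{3} $$

and, as time continues to evolve, the average benefits over all time become

$$ \liminf_{t \to \infty} \; \bar{u}^{S*}_X w_X(t) - \alpha_X z_X(t), \quad X \in \{\mathcal{D}, \mathcal{A}\}. \tag{4} $$

We next express these expected utilities over all time as a function of periodic
strategies that $\mathcal{D}$ and $\mathcal{A}$ employ. Let
$\bar{u}^F_X : \mathbb{R}_+ \times \mathbb{R}_+ \to \mathbb{R}$,
$X \in \{\mathcal{D}, \mathcal{A}\}$ be expected utility functions such that
$\bar{u}^F_{\mathcal{D}}(f_{\mathcal{D}}, f_{\mathcal{A}})$ and
$\bar{u}^F_{\mathcal{A}}(f_{\mathcal{D}}, f_{\mathcal{A}})$ give the average
utility to $\mathcal{D}$ and $\mathcal{A}$, respectively, when they play with
frequencies $f_{\mathcal{D}}$ and $f_{\mathcal{A}}$. If
$f_{\mathcal{D}} \geq f_{\mathcal{A}} > 0$, it can be shown that

$$ \bar{u}^F_{\mathcal{D}}(f_{\mathcal{D}}, f_{\mathcal{A}}) = \bar{u}^{S*}_{\mathcal{D}} \left(1 - \frac{f_{\mathcal{A}}}{2 f_{\mathcal{D}}}\right) - \alpha_{\mathcal{D}} f_{\mathcal{D}}, \tag{5} $$

$$ \bar{u}^F_{\mathcal{A}}(f_{\mathcal{D}}, f_{\mathcal{A}}) = \bar{u}^{S*}_{\mathcal{A}} \frac{f_{\mathcal{A}}}{2 f_{\mathcal{D}}} - \alpha_{\mathcal{A}} f_{\mathcal{A}}, \tag{6} $$

while if $0 \leq f_{\mathcal{D}} < f_{\mathcal{A}}$, then

$$ \bar{u}^F_{\mathcal{D}}(f_{\mathcal{D}}, f_{\mathcal{A}}) = \bar{u}^{S*}_{\mathcal{D}} \frac{f_{\mathcal{D}}}{2 f_{\mathcal{A}}} - \alpha_{\mathcal{D}} f_{\mathcal{D}}, \tag{7} $$

$$ \bar{u}^F_{\mathcal{A}}(f_{\mathcal{D}}, f_{\mathcal{A}}) = \bar{u}^{S*}_{\mathcal{A}} \left(1 - \frac{f_{\mathcal{D}}}{2 f_{\mathcal{A}}}\right) - \alpha_{\mathcal{A}} f_{\mathcal{A}}, \tag{8} $$

and if $f_{\mathcal{A}} = 0$, we have

$$ \bar{u}^F_{\mathcal{A}}(f_{\mathcal{D}}, f_{\mathcal{A}}) = 0, \quad \bar{u}^F_{\mathcal{D}}(f_{\mathcal{D}}, f_{\mathcal{A}}) = \bar{u}^{S*}_{\mathcal{D}} - \alpha_{\mathcal{D}} f_{\mathcal{D}}. \tag{9} $$

Equations (5)-(9), with Equation (1) for $\bar{u}^S_X$,
$X \in \{\mathcal{D}, \mathcal{A}\}$ and Equation (2) for
$\bar{u}^S_{\mathcal{R}}$, will be main ingredients in our equilibrium concept
in the next section.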
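
As a rough illustration, the piecewise utilities of Equations (5)-(9) can be collected in one function. This sketch assumes the standard periodic-strategy Flipit expressions, in which the attacker's share of time is f_A / (2 f_D) when f_D >= f_A > 0 and 1 - f_D / (2 f_A) when f_A > f_D; the function name, argument order, and example numbers are hypothetical.

```python
def flipit_utilities(f_D, f_A, u_D, u_A, alpha_D, alpha_A):
    """Return (defender, attacker) average utilities under periodic play.

    u_D, u_A         : values of controlling the cloud (signaling-game utilities)
    alpha_D, alpha_A : cost per move for defender and attacker
    """
    if f_A == 0:
        # Attacker never moves; the defender keeps the cloud (Eq. (9)).
        return u_D - alpha_D * f_D, 0.0
    if f_D >= f_A:
        share_A = f_A / (2 * f_D)        # attacker's share of time, Eqs. (5)-(6)
    else:
        share_A = 1 - f_D / (2 * f_A)    # attacker's share of time, Eqs. (7)-(8)
    defender = u_D * (1 - share_A) - alpha_D * f_D
    attacker = u_A * share_A - alpha_A * f_A
    return defender, attacker


# Hypothetical example: the defender moves twice as often as the attacker.
print(flipit_utilities(f_D=2, f_A=1, u_D=4, u_A=6, alpha_D=1, alpha_A=1))  # (1.0, 0.5)
```

Writing both branches in terms of the attacker's time share makes it easy to see that the two players' shares always sum to one.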

3 Solution Concept 

In this section, we develop a new equilibrium concept for our CloudControl game
$G_{CC}$. We study the equilibria of the Flipit and signaling games individually,
and then show how they can be related through a fixed-point equation in order
to obtain an overall equilibrium for $G_{CC}$.

3.1 Signaling Game Equilibrium 

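Signaling games are a class of dynamic Bayesian games. Applying the concept
of perfect Bayesian equilibrium (as in, e.g., [10]) to $G_S$, we have
Definition 1.

Definition 1 Let the functions $\bar{u}^S_X$,
$X \in \{\mathcal{D}, \mathcal{A}\}$ and $\bar{u}^S_{\mathcal{R}}$ be formulated
according to Equation (1) and Equation (2), respectively. Then a perfect
Bayesian equilibrium of the signaling game $G_S$ is a strategy profile
$(\sigma^{S*}_{\mathcal{D}}, \sigma^{S*}_{\mathcal{A}}, \sigma^{S*}_{\mathcal{R}})$
and posterior beliefs $\mu(\cdot \mid m)$ such that

$$ \forall X \in \{\mathcal{D}, \mathcal{A}\}, \quad \sigma^{S*}_X(\cdot) \in \arg\max_{\sigma^S_X} \bar{u}^S_X(\sigma^{S*}_{\mathcal{R}}, \sigma^S_X), \tag{10} $$

$$ \forall m \in M, \quad \sigma^{S*}_{\mathcal{R}}(\cdot \mid m) \in \arg\max_{\sigma^S_{\mathcal{R}}} \bar{u}^S_{\mathcal{R}}(\sigma^S_{\mathcal{R}} \mid m, \mu(\cdot \mid m)), \tag{11} $$

$$ \mu(\theta \mid m) = \frac{\mathbf{1}\{\theta = \theta_{\mathcal{A}}\}\, \sigma^{S*}_{\mathcal{A}}(m)\, p + \mathbf{1}\{\theta = \theta_{\mathcal{D}}\}\, \sigma^{S*}_{\mathcal{D}}(m)\, (1 - p)}{\sigma^{S*}_{\mathcal{A}}(m)\, p + \sigma^{S*}_{\mathcal{D}}(m)\, (1 - p)}, \tag{12} $$

if $\sigma^{S*}_{\mathcal{A}}(m)\, p + \sigma^{S*}_{\mathcal{D}}(m)\, (1 - p) \neq 0$, and

$$ \mu(\theta \mid m) = \text{any distribution on } \Theta, \tag{13} $$

if $\sigma^{S*}_{\mathcal{A}}(m)\, p + \sigma^{S*}_{\mathcal{D}}(m)\, (1 - p) = 0$.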
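
The belief condition in Definition 1 is Bayes' rule applied to the two sender strategies. A minimal sketch follows (the function name and the use of `None` for the unrestricted off-path case are assumptions made here): it returns the posterior probability that the cloud is attacker-controlled after message m.

```python
def posterior_attacker(p, sigma_A_m, sigma_D_m):
    """Posterior mu(theta_A | m) given prior p and the probabilities that
    the attacker and the defender each send message m.

    Returns None when m is sent with probability zero, in which case
    any belief is admissible (the off-equilibrium-path case).
    """
    denom = sigma_A_m * p + sigma_D_m * (1 - p)
    if denom == 0:
        return None
    return sigma_A_m * p / denom


# Pooling: both senders always send m, so the posterior equals the prior.
print(posterior_attacker(0.25, 1.0, 1.0))   # 0.25
# Separating: only the attacker sends m, so m reveals the attacker.
print(posterior_attacker(0.25, 1.0, 0.0))   # 1.0
```

The pooling case shows why the prior $p$, and therefore the outcome of the Flipit game, drives the receiver's decision: a pooled message carries no information beyond $p$.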



Next, let $\bar{u}^{S*}_{\mathcal{D}}$, $\bar{u}^{S*}_{\mathcal{A}}$, and
$\bar{u}^{S*}_{\mathcal{R}}$ be the utilities for the defender, attacker, and
device, respectively, when they play according to a strategy profile
$(\sigma^{S*}_{\mathcal{D}}, \sigma^{S*}_{\mathcal{A}}, \sigma^{S*}_{\mathcal{R}})$
and belief $\mu(\cdot \mid m)$ that satisfy the conditions for a perfect
Bayesian equilibrium. Define a set-valued mapping
$T^S : [0,1] \to 2^{U_{\mathcal{D}} \times U_{\mathcal{A}}}$ such that
$T^S(p; G_S)$ gives the set of equilibrium utilities of the defender and
attacker when the prior probabilities are $p$ and $1 - p$ and the signaling
game utilities are parameterized by $G_S$.^3 We have

$$ \{(\bar{u}^{S*}_{\mathcal{D}}, \bar{u}^{S*}_{\mathcal{A}})\} = T^S(p; G_S). \tag{14} $$

We will employ $T^S$ as part of the definition of an overall equilibrium for
$G_{CC}$ after examining the equilibrium of the Flipit game.

3.2 Flipit Game Equilibrium 

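The appropriate equilibrium concept for the Flipit game, when $\mathcal{A}$ and
$\mathcal{D}$ are restricted to periodic strategies, is Nash equilibrium [14].
Definition 2 applies the concept of Nash equilibrium to $G_F$.

Definition 2 A Nash equilibrium of the game $G_F$ is a strategy profile
$(f^*_{\mathcal{D}}, f^*_{\mathcal{A}})$ such that

$$ f^*_{\mathcal{D}} \in \arg\max_{f_{\mathcal{D}}} \bar{u}^F_{\mathcal{D}}(f_{\mathcal{D}}, f^*_{\mathcal{A}}), \tag{15} $$

$$ f^*_{\mathcal{A}} \in \arg\max_{f_{\mathcal{A}}} \bar{u}^F_{\mathcal{A}}(f^*_{\mathcal{D}}, f_{\mathcal{A}}), \tag{16} $$

where $\bar{u}^F_{\mathcal{D}}$ and $\bar{u}^F_{\mathcal{A}}$ are computed by
Equation (5) and Equation (6) if $f_{\mathcal{D}} \geq f_{\mathcal{A}}$ and
Equation (7) and Equation (8) if $f_{\mathcal{D}} < f_{\mathcal{A}}$.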

To find an overall equilibrium of Gcc, we are interested in the proportion 
of time that A and V control the cloud. As before, denote these proportions by 
p and 1 — p, respectively. These proportions (as in i) can be found from the 
equilibrium frequencies by 


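$$ p = \begin{cases} 0, & \text{if } f_{\mathcal{A}} = 0 \\ \dfrac{f_{\mathcal{A}}}{2 f_{\mathcal{D}}}, & \text{if } f_{\mathcal{D}} \geq f_{\mathcal{A}} > 0 \\ 1 - \dfrac{f_{\mathcal{D}}}{2 f_{\mathcal{A}}}, & \text{if } f_{\mathcal{A}} > f_{\mathcal{D}} \geq 0 \end{cases} \tag{17} $$

Let $G_F$ parameterize the Flipit game. Now, we can define a mapping
$T^F : U_{\mathcal{D}} \times U_{\mathcal{A}} \to [0,1]$ such that the
expression $T^F(\bar{u}^{S*}_{\mathcal{D}}, \bar{u}^{S*}_{\mathcal{A}}; G_F)$
gives the proportion of time that the attacker controls the cloud in
equilibrium from the values of controlling the cloud for the defender and the
attacker. This mapping gives

$$ p = T^F(\bar{u}^{S*}_{\mathcal{D}}, \bar{u}^{S*}_{\mathcal{A}}; G_F). \tag{18} $$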

In addition to interpreting $p$ as the proportion of time that the attacker
controls the cloud, we can view it as the likelihood that, at any random time,
the cloud will be controlled by the attacker. Of course, this is precisely the
value $p$ of interest in $G_S$. Clearly, $G_F$ and $G_S$ are coupled by
Equations (14) and (18). These two equations specify the overall equilibrium
for the CloudControl game $G_{CC}$ through a fixed-point equation, which we
describe next.

^3 Since $\mathcal{R}$ does not take part in $G_F$, it is not necessary to
include $\bar{u}^{S*}_{\mathcal{R}}$ as an output of the mapping.

Fig. 2. $G_S$ and $G_F$ interact because the utilities in the Flipit game are
derived from the output of the signaling game, and the output of the Flipit
game is used to define prior probabilities in the signaling game. We call the
fixed-point of the composition of these two relationships a Gestalt equilibrium.


3.3 Gestalt Equilibrium of Gcc 

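When the CloudControl game $G_{CC}$ is in equilibrium, the mapping from the
parameters of $G_S$ to that game's equilibrium and the mapping from the
parameters of $G_F$ to that game's equilibrium are simultaneously satisfied, as
shown in Fig. 2. Definition 3 formalizes this equilibrium, which we call
Gestalt equilibrium.

Definition 3 (Gestalt equilibrium) The cloud control ratio
$p^\dagger \in [0,1]$ and equilibrium signaling game utilities
$\bar{u}^{S\dagger}_{\mathcal{D}}$ and $\bar{u}^{S\dagger}_{\mathcal{A}}$
constitute a Gestalt equilibrium of the game $G_{CC}$ composed of coupled games
$G_S$ and $G_F$ if the two components of Equation (19) are simultaneously
satisfied.

$$ (\bar{u}^{S\dagger}_{\mathcal{D}}, \bar{u}^{S\dagger}_{\mathcal{A}}) \in T^S(p^\dagger; G_S), \quad p^\dagger = T^F(\bar{u}^{S\dagger}_{\mathcal{D}}, \bar{u}^{S\dagger}_{\mathcal{A}}; G_F) \tag{19} $$

In short, the signaling game utilities
$(\bar{u}^{S\dagger}_{\mathcal{D}}, \bar{u}^{S\dagger}_{\mathcal{A}})$ must
satisfy the fixed-point equation

$$ (\bar{u}^{S\dagger}_{\mathcal{D}}, \bar{u}^{S\dagger}_{\mathcal{A}}) \in T^S\left(T^F(\bar{u}^{S\dagger}_{\mathcal{D}}, \bar{u}^{S\dagger}_{\mathcal{A}}; G_F); G_S\right). \tag{20} $$

In this equilibrium, $\mathcal{A}$ receives $\bar{u}^F_{\mathcal{A}}$ according
to Equation (6), Equation (8), or Equation (9), $\mathcal{D}$ receives
$\bar{u}^F_{\mathcal{D}}$ according to Equation (5), Equation (7), or
Equation (9), and $\mathcal{R}$ receives $\bar{u}^S_{\mathcal{R}}$ according to
Equation (2).

Solving for the equilibrium of $G_{CC}$ requires a fixed-point equation
essentially because the games $G_F$ and $G_S$ are played according to prior
commitment. Prior commitment specifies that players in $G_S$ do not know the
outcome of $G_F$. This structure prohibits us from using a sequential concept
such as sub-game perfection and suggests instead a fixed-point equation.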


4 Analysis 

In this section, we analyze the game proposed in Section 2 based on our solution
concept in Section 3. First, we analyze the signaling game and calculate the
corresponding equilibria. Then, we solve the Flipit game for different values of
expected payoffs resulting from the signaling game. Finally, we describe the
solution of the combined game.


4.1 Signaling Game Analysis 


The premise of $G_{CC}$ allows us to make some basic assumptions about the
utility parameters that simplify the search for equilibria. We expect these
assumptions to be true across many different contexts.

A1) $u^S_{\mathcal{R}}(\theta_{\mathcal{D}}, m_L, a_T) > u^S_{\mathcal{R}}(\theta_{\mathcal{D}}, m_L, a_N)$:
It is beneficial for the receiver to trust a low risk message from the defender.

A2) $u^S_{\mathcal{R}}(\theta_{\mathcal{A}}, m_H, a_T) < u^S_{\mathcal{R}}(\theta_{\mathcal{A}}, m_H, a_N)$:
It is harmful for the receiver to trust a high risk message from the attacker.

A3) $\forall m, m' \in M$, $u^S_{\mathcal{A}}(m, a_T) > u^S_{\mathcal{A}}(m', a_N)$
and $\forall m, m' \in M$, $u^S_{\mathcal{D}}(m, a_T) > u^S_{\mathcal{D}}(m', a_N)$:
Both types of sender prefer that either of their messages is trusted rather
than that either of their messages is rejected.

A4) $u^S_{\mathcal{A}}(m_H, a_T) > u^S_{\mathcal{A}}(m_L, a_T)$: The attacker
prefers an outcome in which the receiver trusts his high risk message to an
outcome in which the receiver trusts his low risk message.


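Pooling equilibria of the signaling game differ depending on the prior
probabilities $p$ and $1 - p$. Specifically, the messages on which
$\mathcal{A}$ and $\mathcal{D}$ pool and the equilibrium action of
$\mathcal{R}$ depend on quantities in Equations (21) and (22) which we call
trust benefits.

$$ TB_H(p) = p \left[ u^S_{\mathcal{R}}(\theta_{\mathcal{A}}, m_H, a_T) - u^S_{\mathcal{R}}(\theta_{\mathcal{A}}, m_H, a_N) \right] + (1 - p) \left[ u^S_{\mathcal{R}}(\theta_{\mathcal{D}}, m_H, a_T) - u^S_{\mathcal{R}}(\theta_{\mathcal{D}}, m_H, a_N) \right] \tag{21} $$

$$ TB_L(p) = p \left[ u^S_{\mathcal{R}}(\theta_{\mathcal{A}}, m_L, a_T) - u^S_{\mathcal{R}}(\theta_{\mathcal{A}}, m_L, a_N) \right] + (1 - p) \left[ u^S_{\mathcal{R}}(\theta_{\mathcal{D}}, m_L, a_T) - u^S_{\mathcal{R}}(\theta_{\mathcal{D}}, m_L, a_N) \right] \tag{22} $$

$TB_H(p)$ and $TB_L(p)$ give the benefit of trusting (compared to not
trusting) high and low messages, respectively, when the prior probability is
$p$. These quantities specify whether $\mathcal{R}$ will trust a message that it
receives in a pooling equilibrium. If $TB_H(p)$ (respectively, $TB_L(p)$) is
positive, then, in equilibrium, $\mathcal{R}$ will trust all messages when the
senders pool on $m_H$ (respectively, $m_L$).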
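
The trust benefits are linear in $p$, so they are easy to tabulate. The sketch below is illustrative only: the receiver utilities are an assumption, chosen to match the example values listed with Fig. 4 as best they can be read, and the function names and the convention of placing $TB_L$ on the horizontal axis are likewise assumptions.

```python
# Receiver utilities u_R(theta, m, a); assumed values, read from the Fig. 4 list.
U_R = {
    ("theta_D", "m_H", "a_T"): 3, ("theta_D", "m_H", "a_N"): 1,
    ("theta_D", "m_L", "a_T"): 5, ("theta_D", "m_L", "a_N"): 0,
    ("theta_A", "m_H", "a_T"): -8, ("theta_A", "m_H", "a_N"): 1,
    ("theta_A", "m_L", "a_T"): 1, ("theta_A", "m_L", "a_N"): 0,
}


def trust_benefit(p, m, u_R=U_R):
    """Benefit of a_T over a_N for message m when the attacker holds the cloud w.p. p."""
    gain_if_attacker = u_R[("theta_A", m, "a_T")] - u_R[("theta_A", m, "a_N")]
    gain_if_defender = u_R[("theta_D", m, "a_T")] - u_R[("theta_D", m, "a_N")]
    return p * gain_if_attacker + (1 - p) * gain_if_defender


def quadrant(p):
    """Quadrant of the point (TB_L(p), TB_H(p)), with TB_L on the horizontal axis."""
    tb_l, tb_h = trust_benefit(p, "m_L"), trust_benefit(p, "m_H")
    if tb_l == 0 or tb_h == 0:
        return "border"
    if tb_h > 0:
        return "I" if tb_l > 0 else "II"
    return "IV" if tb_l > 0 else "III"


for p in (0.0, 0.1, 0.5, 1.0):
    print(p, trust_benefit(p, "m_H"), trust_benefit(p, "m_L"), quadrant(p))
```

With these assumed values, $TB_H(p) = 2 - 11p$ changes sign at $p = 2/11$ while $TB_L(p) = 5 - 4p$ stays positive, so the point moves from Quadrant I to Quadrant IV as $p$ grows.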




[Figure 3: diagram of the four quadrants defined by the signs of $TB_L(p)$ and
$TB_H(p)$. Each quadrant is labeled "Pooling with Belief Restrictions" together
with the messages sent by $\mathcal{A}$ and $\mathcal{D}$ and the action of
$\mathcal{R}$; a shaded zone is labeled "Separating Equilibrium".]

Fig. 3. The four quadrants represent parameter regions of $G_S$. The regions
vary based on the types of pooling equilibria that they support. For instance,
quadrant IV supports a pooling equilibrium in which $\mathcal{A}$ and
$\mathcal{D}$ both send $m_H$ and $\mathcal{R}$ plays $a_N$, as well as a
pooling equilibrium in which $\mathcal{A}$ and $\mathcal{D}$ both send $m_L$ and
$\mathcal{R}$ plays $a_T$. The shaded regions denote special equilibria that
occur under further parameter restrictions.


We illustrate the different possible combinations of $TB_H(p)$ and $TB_L(p)$ in
the quadrants of Fig. 3. The labeled messages and actions for the sender and
receiver, respectively, in each quadrant denote these pooling equilibria. These
pooling equilibria apply throughout each entire quadrant. Note that we have
not listed the requirements on belief $\mu$ here. These are addressed in the
Appendix and become especially important for various equilibrium refinement
procedures.

The shaded regions of Fig. 3 denote additional special equilibria which
only occur under the additional parameter constraints listed within the regions.
(The geometrical shapes of the shaded regions are not meaningful, but their
overlap and location relative to the four quadrants are accurate.) The dotted
and uniformly shaded zones contain equilibria similar to those already denoted
in the equilibria for each quadrant, except that they do not require
restrictions on $\mu$. The zone with horizontal bars denotes the game's only
separating equilibrium. It is a rather unproductive one for $\mathcal{D}$ and
$\mathcal{A}$, since their messages are not trusted. (See the derivation in
Appendix A.1.) The equilibria depicted in Fig. 3 will become the basis of
analyzing the mapping $T^S(p; G_S)$, which will be crucial for forming our
fixed-point equation that defines the Gestalt equilibrium. Before studying this
mapping, however, we first analyze the equilibria of the Flipit game on its own.

































4.2 Flipit Analysis 


In this subsection, we calculate the Nash equilibrium in the Flipit game.
Equations (5)-(9) represent both players' utilities in the Flipit game. The
solution of this game is similar to what has been presented in mm. except that
the reward of controlling the resource may vary. To calculate the Nash
equilibrium, we normalize both players' benefit with respect to the reward of
controlling the resource. For different cases, the move frequencies at Nash
equilibrium are:


axi 

“■D 


a-D 

yS* 
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ax> 

yS* 

“■D 


aA 


0<A 

'ji'S'* 


O^A 
0-1S* 




j,A ^ 


< 0 : 


and u ^*, 


> 0 : 


fv = 


'Ti’S'* 


2a_4 


fA = 


a-D 




and 


^T> 


> 0 : 


fi = 


aA 


2a^ 




fA = 


-S* 
2ay ’ 


and > 0: 


fy = 


u^* 


A f* _ 

•2a a' 


-s* 

2ay ’ 


(23) 


(24) 


(25) 


fi = rA = 0 , ( 26 ) 

• •^A ^ 6 and u^* < 0: 

/i. = 0 a = 0+. (27) 

In the case that $\bar{u}^{S*}_{\mathcal{A}} \leq 0$, the attacker has no
incentive to attack the cloud. In this case, the defender need not move since
we assume that she controls the cloud initially. In the case that
$\bar{u}^{S*}_{\mathcal{A}} > 0$ and $\bar{u}^{S*}_{\mathcal{D}} \leq 0$, only
the attacker has an incentive to control the cloud. We use
$f^*_{\mathcal{A}} = 0^+$ to signify that the attacker moves only once. Since
the defender never moves, the attacker's single move is enough to retain
control of the cloud at all times.
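
The case analysis above can be sketched as a single function. This is an illustrative reading, not the authors' implementation: it assumes the standard periodic-strategy Flipit equilibrium in which each player's move cost is normalized by that player's reward for holding the cloud, and it reports the attacker's resulting share of time $p$ alongside the frequencies. The function name and example numbers are hypothetical.

```python
def flipit_equilibrium(u_D, u_A, alpha_D, alpha_A):
    """Return (f_D, f_A, p): equilibrium move frequencies and attacker time share.

    u_D, u_A         : rewards for controlling the cloud (signaling-game utilities)
    alpha_D, alpha_A : positive costs per move
    """
    if u_A <= 0:
        return 0.0, 0.0, 0.0          # attacker never attacks; defender keeps control
    if u_D <= 0:
        # Attacker moves once (f_A = 0+) and the defender never reclaims: p = 1.
        return 0.0, 0.0, 1.0
    k_D, k_A = alpha_D / u_D, alpha_A / u_A   # costs normalized by reward
    if k_D < k_A:
        f_D = u_A / (2 * alpha_A)
        f_A = alpha_D * u_A ** 2 / (2 * alpha_A ** 2 * u_D)
    elif k_D > k_A:
        f_D = alpha_A * u_D ** 2 / (2 * alpha_D ** 2 * u_A)
        f_A = u_D / (2 * alpha_D)
    else:
        f_D = u_A / (2 * alpha_A)
        f_A = u_D / (2 * alpha_D)
    p = f_A / (2 * f_D) if f_D >= f_A else 1 - f_D / (2 * f_A)
    return f_D, f_A, p


# Hypothetical rewards and unit move costs.
print(flipit_equilibrium(u_D=3, u_A=2, alpha_D=1, alpha_A=1))
```

Under these formulas $p$ works out to $\alpha_D u_A / (2 \alpha_A u_D)$ in the first case and $1 - \alpha_A u_D / (2 \alpha_D u_A)$ in the second, so it depends on the rewards only through the ratio $u_A / u_D$.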

Next, we put together the analysis of $G_S$ and $G_F$ in order to study the
Gestalt equilibria of the entire game.


4.3 Gcc Analysis 


To identify the Gestalt equilibrium of $G_{CC}$, it is necessary to examine the
mapping $T^S(p; G_S)$ for all $p \in [0,1]$. As noted in Section 4.1, this
mapping depends on $TB_H(p)$ and $TB_L(p)$. From assumptions A1-A4, it is
possible to verify that $(TB_L(0), TB_H(0))$ must fall in Quadrant I or
Quadrant IV and that $(TB_L(1), TB_H(1))$ must lie in Quadrant III or
Quadrant IV. There are numerous ways in which the set
$(TB_L(p), TB_H(p))$, $p \in [0,1]$ can traverse different parameter regions.
Rather than enumerating all of them, we consider one here.

[Figure 4: plot of $(TB_L(p), TB_H(p))$ for $p \in [0,1]$ over the quadrants of
Fig. 3, with regions labeled "Pooling with Belief Restrictions" and the
following parameter values overlaid.]

Parameter Values

  $u^S_{\mathcal{R}}(\theta_{\mathcal{D}}, m_H, a_T) = 3$      $u^S_{\mathcal{R}}(\theta_{\mathcal{D}}, m_H, a_N) = 1$
  $u^S_{\mathcal{R}}(\theta_{\mathcal{D}}, m_L, a_T) = 5$      $u^S_{\mathcal{R}}(\theta_{\mathcal{D}}, m_L, a_N) = 0$
  $u^S_{\mathcal{R}}(\theta_{\mathcal{A}}, m_H, a_T) = -8$     $u^S_{\mathcal{R}}(\theta_{\mathcal{A}}, m_H, a_N) = 1$
  $u^S_{\mathcal{R}}(\theta_{\mathcal{A}}, m_L, a_T) = 1$      $u^S_{\mathcal{R}}(\theta_{\mathcal{A}}, m_L, a_N) = 0$
  $u^S_{\mathcal{D}}(m_H, a_T) = 4$                            $u^S_{\mathcal{D}}(m_H, a_N) = -2$
  $u^S_{\mathcal{D}}(m_L, a_T) = 3$                            $u^S_{\mathcal{D}}(m_L, a_N) = -1$
  $u^S_{\mathcal{A}}(m_H, a_T) = 6$                            $u^S_{\mathcal{A}}(m_H, a_N) = -5$
  $u^S_{\mathcal{A}}(m_L, a_T) = 2$                            $u^S_{\mathcal{A}}(m_L, a_N) = -1$

Fig. 4. For the parameter values overlaid on the figure, as $p$ ranges from 0
to 1, $TB_H(p)$ and $TB_L(p)$ move from Quadrant I to Quadrant IV. The
equilibria supported in each of these quadrants, as well as the equilibria
supported on the interface between them, are presented in Table 1.


Consider parameters such that $TB_L(0), TB_H(0) > 0$ and $TB_L(1) > 0$ but
$TB_H(1) < 0$.^4 This leads to a curve $(TB_L(p), TB_H(p))$ that traverses from
Quadrant I to Quadrant IV. Let us also assume that
$u^S_{\mathcal{D}}(m_L, a_T) < u^S_{\mathcal{D}}(m_H, a_T)$, so that
Equilibrium 5 is not feasible. In Fig. 4, we give specific values of parameters
that satisfy these conditions, and we plot $(TB_L(p), TB_H(p))$ for
$p \in [0,1]$. Then, in Table 1, we give the equilibria in each region that the
line segment traverses. The equilibrium numbers refer to the derivations in
Appendix A.2.


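^4 These parameters must satisfy
$u^S_{\mathcal{R}}(\theta_{\mathcal{D}}, m_H, a_T) > u^S_{\mathcal{R}}(\theta_{\mathcal{D}}, m_H, a_N)$
and
$u^S_{\mathcal{R}}(\theta_{\mathcal{A}}, m_L, a_T) > u^S_{\mathcal{R}}(\theta_{\mathcal{A}}, m_L, a_N)$.
Here, we give them specific values in order to plot the data.
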
Table 1. Signaling game equilibria by region for a game that traverses between
Quadrant I and Quadrant IV. Some of the equilibria are feasible only for
constrained beliefs $\mu$ specified in the Appendix. We argue that the
equilibria in each region marked by (*) will be selected.

Region              Equilibria
------------------  ------------------------------------------------------------------------
Quadrant I            Equilibrium 3: Pool on $m_L$; $\mu$ constrained; $\mathcal{R}$ plays $a_T$
                    * Equilibrium 8: Pool on $m_H$; $\mu$ unconstrained; $\mathcal{R}$ plays $a_T$

$TB_H(p) = 0$ Axis  * Equilibrium 3: Pool on $m_L$; $\mu$ constrained; $\mathcal{R}$ plays $a_T$
                      Equilibrium 8: Pool on $m_H$; $\mu$ unconstrained; $\mathcal{R}$ plays $a_T$
                      Equilibrium 6: Pool on $m_H$; $\mu$ constrained; $\mathcal{R}$ plays $a_N$

Quadrant IV         * Equilibrium 3: Pool on $m_L$; $\mu$ constrained; $\mathcal{R}$ plays $a_T$
                      Equilibrium 6: Pool on $m_H$; $\mu$ constrained; $\mathcal{R}$ plays $a_N$

If $p$ is such that the signaling game is played in Quadrant I, then both
senders prefer pooling on $m_H$. By the first mover advantage, they will select
Equilibrium 8. On the border between Quadrant I and Quadrant IV, $\mathcal{A}$
and $\mathcal{D}$ both prefer an equilibrium in which $\mathcal{R}$ plays $a_T$.
If they pool on $m_L$, this is guaranteed. If they pool on $m_H$, however,
$\mathcal{R}$ receives equal utility for playing $a_T$ and $a_N$; thus, the
senders cannot guarantee that the receiver will play $a_T$. Here, we assume
that the senders maximize their worst-case utility, and thus pool on $m_L$.
This is Equilibrium 3. Finally, in Quadrant IV, both senders prefer to be
trusted, and so select Equilibrium 3. From the table, we can see that the
utilities will have a jump at the border between Quadrant I and Quadrant IV.
The solid line in Fig. 5 plots the ratio
$\bar{u}^{S*}_{\mathcal{A}} / \bar{u}^{S*}_{\mathcal{D}}$ of the utilities as a
function of $p$.

Next, consider the mapping
$p = T^F(\bar{u}^{S*}_{\mathcal{D}}, \bar{u}^{S*}_{\mathcal{A}})$. As we have
noted, $p$ depends only on the ratio
$\bar{u}^{S*}_{\mathcal{A}} / \bar{u}^{S*}_{\mathcal{D}}$.^5 Indeed, it is
continuous in that ratio when the outcome at the endpoints is appropriately
defined. This mapping is represented by the dashed line in Fig. 5, with the
independent variable on the vertical axis.

We seek a fixed-point, in which $p = T^F(T^S(p))$. This is shown by the
intersection of the solid and dashed curves plotted in Fig. 5. At these points,
the mappings between the signaling and the Flipit games are mutually satisfied,
and we have a Gestalt equilibrium.^6
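
For this example the fixed point can be located by checking each branch of the signaling-game mapping for self-consistency. The sketch below rests on several assumptions that are not taken from the paper: the selected utilities (4, 6) in Quadrant I and (3, 2) on the border and in Quadrant IV are read from the Fig. 4 parameter list, the border $p = 2/11$ follows from $TB_H(p) = 2 - 11p$ under those values, and the move costs passed in are hypothetical.

```python
def t_s(p):
    """Selected signaling-game utilities (u_D, u_A) for the worked example.

    Quadrant I (p below the border) selects pooling on m_H with trust;
    the border and Quadrant IV select pooling on m_L with trust.
    """
    p_border = 2 / 11          # TB_H(p) = 2 - 11 p changes sign here (assumed values)
    return (4.0, 6.0) if p < p_border else (3.0, 2.0)


def t_f(u_D, u_A, alpha_D, alpha_A):
    """Attacker's equilibrium share of time; depends only on the ratio u_A / u_D."""
    if u_A <= 0:
        return 0.0
    if u_D <= 0:
        return 1.0
    r = u_A / u_D
    if r <= alpha_A / alpha_D:
        return alpha_D * r / (2 * alpha_A)
    return 1 - alpha_A / (2 * alpha_D * r)


def gestalt_candidates(alpha_D, alpha_A):
    """Return every p for which p = T^F(T^S(p)) holds in this example."""
    found = []
    for probe in (0.0, 1.0):   # one probe inside each branch of t_s
        p = t_f(*t_s(probe), alpha_D, alpha_A)
        if t_s(p) == t_s(probe):   # p falls back into the branch that produced it
            found.append(p)
    return found


print(gestalt_candidates(alpha_D=1.0, alpha_A=1.0))
```

With unit costs only the Quadrant IV branch is self-consistent, at $p = 1/3$. Because the example mapping is piecewise constant, two probes suffice; a mapping with more regions or mixed strategies on the borders would need a finer search.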


^5 When $\bar{u}^{S*}_{\mathcal{A}} = \bar{u}^{S*}_{\mathcal{D}} = 0$, we define
that ratio to be equal to zero, since this will yield $f_{\mathcal{A}} = 0$ and
$p = 0$, as in Equations (26) and (17). When $\bar{u}^{S*}_{\mathcal{D}} = 0$
and $\bar{u}^{S*}_{\mathcal{A}} > 0$, it is convenient to consider the ratio to
be positively infinite since this is consistent with $p = 1$.

^6 Note that this example featured a discontinuity in signaling game utilities
on the border between equilibrium regions. Interestingly, even when the pooling
equilibria differ between regions, it is possible that the equilibrium on the
border admits a mixed strategy that provides continuity between the different
equilibria in the two regions, and thus makes $T^S$ continuous. This could
allow $G_{CC}$ to have multiple Gestalt equilibria.

Fig. 5. $T^F$ and $T^S$ are combined on a single set of axes. In $T^S$ (the
solid line), the independent variable is on the horizontal axis. In $T^F$ (the
dashed line), the independent variable is on the vertical axis. The
intersection of the two curves represents the Gestalt equilibrium.


5 Cloud Control Application 


In this section, we describe one possible application of our model: a
cyber-physical system composed of autonomous vehicles with some on-board
control but also with the ability to trust commands from the cloud. Access to
the cloud can offer automated vehicles several benefits [12]. First, it allows
access to massive computational resources - i.e., infrastructure as a service
(IaaS). (See [5].) Second, it allows access to large datasets. These datasets
can offer super-additive benefits to the sensing capabilities of the vehicle
itself, as in the case of the detailed road and terrain maps that automated
cars such as those created by Google and Delphi combine with data collected by
lidar, radar and vision-based cameras [nm. Third, interfacing with the cloud
allows access to data collected or processed by humans through crowd-sourcing
applications; consider, for instance, location-based services mm that feature
recommendations from other users. Finally, the cloud can allow vehicles to
collectively learn through experience [T^.

Attackers may attempt to influence cloud control of the vehicle through
several means. In one type of attack, adversaries may be able to steal or infer
cryptographic keys that allow them authorization into the network. These
attacks are of the complete compromise and stealth types that are studied in
the Flipit framework HO], 0 and thus are appropriate for a CloudControl game.
Flipit also provides the ability to model zero-day exploits, vulnerabilities
for which a patch is not currently available. Each of these types of attacks on
the cloud poses threats to unmanned vehicle security and involves the complete
compromise and stealthiness that motivate the Flipit framework.










5.1 Dynamic Model for Cloud Controlled Unmanned Vehicles 

In this subsection, we use a dynamic model of an autonomous car to illustrate one 
specific context in which a cloud-connected device could be making a decision 
of whether to trust the commands that it would receive or to follow its own 
on-board control. 



Fig. 6. A bicycle model is a type of representation of vehicle steering control. Here,
$\delta(t)$ is used to denote the angle between the orientation of the front wheel and the
heading $\theta(t)$. The deviation of the vehicle from a straight line is given by $z(t)$.


We consider a car moving in two-dimensional space with a fixed speed $v_0$ but
with steering that can be controlled. (See Fig. 6, which illustrates the
"bicycle model" of steering control from [5].) For simplicity, assume that we
are interested in the car's deviation from a straight line. (This line might,
e.g., run along the center of the proper driving lane.) Let $z(t)$ denote the
car's vertical distance from the horizontal line, and let $\theta(t)$ denote
the heading of the car at time $t$. The state of the car can be represented by
a two-dimensional vector $w(t) = [\, z(t) \;\; \theta(t) \,]^T$. Let
$\delta(t)$ denote the angle between the orientation of the front wheel - which
implements steering - and the orientation of the length of the car. We can
consider $\delta(t)$ to be the input to the system. Finally, let $y(t)$
represent a vector of outputs available to the car's control system. The
self-driving cars of both Google and Delphi employ radar, lidar, and
vision-based cameras for localization. Assume that these allow accurate
measurement of both states, such that $y_1(t) = z(t)$ and
$y_2(t) = \theta(t)$. If the car stays near $w(t) = [\, 0 \;\; 0 \,]^T$, then
we can approximate the system with a linear model. Let $a$ and $b$ denote the
distances from the rear wheel to the center of gravity and the rear wheel to the
front wheel of the car, respectively. Then the linearized system is given in M by 
the equations: 
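
For orientation, the following is a hedged sketch of what such a linearization typically looks like. It assumes the standard small-angle approximation of the bicycle model about $w = [\, 0 \;\; 0 \,]^T$, with the speed $v_0$ and the lengths $a$ and $b$ defined above; the exact form used in the cited reference should be checked against it.

```latex
\begin{align}
\frac{d}{dt}\begin{bmatrix} z(t) \\ \theta(t) \end{bmatrix}
  &= \begin{bmatrix} 0 & v_0 \\ 0 & 0 \end{bmatrix}
     \begin{bmatrix} z(t) \\ \theta(t) \end{bmatrix}
   + \begin{bmatrix} a v_0 / b \\ v_0 / b \end{bmatrix} \delta(t), \\
y(t) &= \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}
        \begin{bmatrix} z(t) \\ \theta(t) \end{bmatrix}.
\end{align}
```

Under this assumption, the heading changes at rate $v_0 \delta / b$, while the lateral position picks up $v_0 \theta$ from the heading plus $a v_0 \delta / b$ from the steering angle acting at the center of gravity.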

5.2 Control of Unmanned Vehicle


Assume that the unmanned car has some capacity for automatic control without
the help of the cloud, but that the cloud typically provides more advanced
navigation.

Specifically, consider a control system onboard the unmanned vehicle designed
to return it to the equilibrium $w(t) = [\, 0 \;\; 0 \,]^T$. Because the car has access
to both of the states, it can implement a state-feedback control. Consider a
linear, time-invariant control of the form


$$\delta_{car}(t) = -\begin{bmatrix} k_1 & k_2 \end{bmatrix} \begin{bmatrix} z(t) \\ \theta(t) \end{bmatrix} \qquad (30)$$


This proportional control results in the closed-loop system 
(31)
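
To see what such a closed loop looks like numerically, the following minimal sketch integrates it with forward Euler. It assumes the standard linearized bicycle model $\dot z = v_0 \theta + (a v_0 / b)\,\delta$, $\dot\theta = (v_0 / b)\,\delta$ and the feedback $\delta_{car} = -(k_1 z + k_2 \theta)$. The parameter values, gains, and function names are illustrative assumptions, not taken from the paper.

```python
def closed_loop_step(z, theta, k1, k2, v0, a, b, dt):
    """Advance the assumed linearized bicycle model by one Euler step
    under on-board state feedback delta_car = -(k1*z + k2*theta)."""
    delta_car = -(k1 * z + k2 * theta)
    dz = v0 * theta + (a * v0 / b) * delta_car
    dtheta = (v0 / b) * delta_car
    return z + dt * dz, theta + dt * dtheta


def simulate(z0, theta0, k1=0.05, k2=0.5, v0=10.0, a=1.5, b=3.0,
             dt=0.01, t_end=20.0):
    """Return the state trajectory [(z, theta), ...] starting at (z0, theta0).

    All default values are hypothetical and chosen only for illustration.
    """
    z, theta = z0, theta0
    trajectory = [(z, theta)]
    for _ in range(int(round(t_end / dt))):
        z, theta = closed_loop_step(z, theta, k1, k2, v0, a, b, dt)
        trajectory.append((z, theta))
    return trajectory


if __name__ == "__main__":
    # Start one unit away from the lane center with zero heading error.
    traj = simulate(z0=1.0, theta0=0.0)
    z_final, theta_final = traj[-1]
    print(f"final deviation z = {z_final:.2e}, heading = {theta_final:.2e}")
```

Under these assumptions the closed-loop matrix has trace $-(v_0/b)(a k_1 + k_2)$ and determinant $v_0^2 k_1 / b$, so any positive gains $k_1$, $k_2$ drive the state back to the equilibrium.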


The unmanned car $\mathcal{R}$ may also elect to obtain data or computational resources
from the cloud. Typically, this additional access would improve the control
of the car. The cloud administrator (defender $\mathcal{D}$), however, may issue faulty
commands, or there may be a breakdown in communication of the desired signals.
In addition, the cloud may be compromised by $\mathcal{A}$ in a way that is stealthy.
Because of these factors, $\mathcal{R}$ sometimes benefits from rejecting the cloud’s
command and relying on its own navigational abilities. Denote the command issued
by the cloud at time $t$ by $\delta_{cloud}(t) \in \{\delta_{\mathcal{A}}(t), \delta_{\mathcal{D}}(t)\}$, depending on who controls
the cloud. With this command, the system is given by
(32)


5.3 Filter for High Risk Cloud Commands 

In cloud control of an unmanned vehicle, the self-navigation state feedback input
given by $\delta_{car}(t)$ in Equation (30) represents the control that is expected by the
vehicle given its state. If the signal from the cloud differs significantly from the
signal given by the self-navigation system, then the vehicle may classify the
message as “high-risk.” Specifically, define a difference threshold $\tau$, and let


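$$m = \begin{cases} m_H, & \text{if } |\delta_{cloud}(t) - \delta_{car}(t)| > \tau \\ m_L, & \text{if } |\delta_{cloud}(t) - \delta_{car}(t)| \le \tau \end{cases} \qquad (33)$$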


Equation (33) translates the actual command from the cloud (controlled by $\mathcal{D}$
or $\mathcal{A}$) into a message in the cloud signaling game.
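
The filter in Equation (33) can be sketched in a few lines. This is a minimal illustration, assuming that a command is labeled high-risk ($m_H$) when it differs from the on-board input by more than the threshold and low-risk ($m_L$) otherwise. The function names, the string labels for messages and actions, and the example pure strategy are hypothetical.

```python
def classify_command(delta_cloud, delta_car, tau):
    """Map a cloud steering command to a signaling-game message.

    Returns "m_H" (high-risk) when the command differs from the on-board
    feedback input by more than the threshold tau, and "m_L" otherwise.
    """
    if tau < 0:
        raise ValueError("threshold tau must be non-negative")
    return "m_H" if abs(delta_cloud - delta_car) > tau else "m_L"


def applied_steering(delta_cloud, delta_car, tau, strategy):
    """Pick the steering input the vehicle actually applies.

    `strategy` is an assumed pure receiver strategy: a dict from message
    ("m_L" or "m_H") to action ("a_T" = trust the cloud, "a_N" = do not).
    """
    message = classify_command(delta_cloud, delta_car, tau)
    action = strategy[message]
    return delta_cloud if action == "a_T" else delta_car


if __name__ == "__main__":
    # Illustrative strategy: trust low-risk commands, reject high-risk ones.
    strategy = {"m_L": "a_T", "m_H": "a_N"}
    print(applied_steering(0.05, 0.02, tau=0.1, strategy=strategy))  # 0.05
    print(applied_steering(0.60, 0.02, tau=0.1, strategy=strategy))  # 0.02
```

A mixed receiver strategy could be handled the same way by storing a trust probability per message instead of a fixed action.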


Equations (31) and (32) give the dynamics of the unmanned car when it elects
not to trust and to trust the cloud, respectively. Based on these equations, Fig. 7
illustrates the combined self-navigating and cloud controlled system for vehicle control.

Fig. 7. Block-diagram model for unmanned vehicle navigation control. At any time,
the vehicle uses its strategy to decide whether to follow its own control or the control
signal from the cloud, which may be $\delta_{\mathcal{A}}$ or $\delta_{\mathcal{D}}$, depending on the probabilities $p$, $1 - p$
with which $\mathcal{A}$ and $\mathcal{D}$ control the cloud. Its own control signal, $\delta_{car}$, is obtained via
feedback control.


6 Conclusion and Future Work 

In this paper, we have proposed a general framework for the interaction between
an attacker, cloud administrator/defender, and cloud-connected device. We have
described the struggle for control of the cloud using the Flipit game and the
interaction between the cloud and the connected device using a traditional
signaling game. Because these two games are played by prior commitment, they are
coupled. We have defined a new equilibrium concept, Gestalt equilibrium,
which defines a solution to the combined game using a fixed-point equation. After
illustrating various parameter regions under which the game may be played,
we solved the game in a sample parameter region. Finally, we showed how the
framework may be applied to unmanned vehicle control.

Several directions remain open for future work. First, the physical component
of the cyber-physical system can be further examined. Tools from optimal
control such as the linear-quadratic regulator could offer a rigorous framework
for defining the costs associated with the physical dynamic system, which in turn
would define the payoffs of the signaling game. Second, future work could search
for conditions under which a Gestalt equilibrium of the CloudControl game is
guaranteed to exist. Finally, devices that use this framework should be equipped
to learn online. Towards that end, a learning algorithm could be developed that
is guaranteed to converge to the Gestalt equilibrium. Together with the framework
developed in the present paper, these directions would help to advance our
ability to secure cloud-connected and cyber-physical systems.

A Derivation of Signaling Game Equilibria

In this appendix, we solve for the equilibria of $G_S$.

A.1 Separating Equilibria

First, we search for separating equilibria of $G_S$. In separating equilibria, $\mathcal{R}$ knows
with certainty the type of the cloud.

$\mathcal{D}$ plays $m_L$ and $\mathcal{A}$ plays $m_H$. If $\mathcal{D}$ plays $m_L$ (as a pure strategy) and $\mathcal{A}$
plays $m_H$, then the receiver rejects any $m_H$ according to assumption A2. The
best action for $\mathcal{A}$ is to deviate to $m_L$. Thus, this is not an equilibrium.

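$\mathcal{D}$ plays $m_H$ and $\mathcal{A}$ plays $m_L$. If $\mathcal{D}$ plays $m_H$ and $\mathcal{A}$ plays $m_L$, then $\mathcal{R}$'s best
response depends on the utility parameters. If $u_{\mathcal{R}}(\theta_{\mathcal{A}}, m_L, a_T) < u_{\mathcal{R}}(\theta_{\mathcal{A}}, m_L, a_N)$
and $u_{\mathcal{R}}(\theta_{\mathcal{D}}, m_H, a_T) < u_{\mathcal{R}}(\theta_{\mathcal{D}}, m_H, a_N)$, then $\mathcal{R}$ plays $a_N$ in response to both
messages. There is no incentive to deviate. Denote this separating equilibrium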
as Equilibrium 

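If $u_{\mathcal{R}}(\theta_{\mathcal{A}}, m_L, a_T) < u_{\mathcal{R}}(\theta_{\mathcal{A}}, m_L, a_N)$ and $u_{\mathcal{R}}(\theta_{\mathcal{D}}, m_H, a_T) > u_{\mathcal{R}}(\theta_{\mathcal{D}}, m_H, a_N)$,
then $a_N$ is within the set of best responses to $m_L$, whereas $a_T$ is the unique best
response to $m_H$. Assuming that he prefers to receive a higher utility with certainty,
$\mathcal{A}$ deviates to $m_H$.

If $u_{\mathcal{R}}(\theta_{\mathcal{A}}, m_L, a_T) > u_{\mathcal{R}}(\theta_{\mathcal{A}}, m_L, a_N)$ and $u_{\mathcal{R}}(\theta_{\mathcal{D}}, m_H, a_T) < u_{\mathcal{R}}(\theta_{\mathcal{D}}, m_H, a_N)$,
then $a_N$ is within the set of best responses to $m_H$, whereas $a_T$ is the unique best
response to $m_L$. Thus, $\mathcal{D}$ deviates to $m_L$.

If $u_{\mathcal{R}}(\theta_{\mathcal{A}}, m_L, a_T) > u_{\mathcal{R}}(\theta_{\mathcal{A}}, m_L, a_N)$ and $u_{\mathcal{R}}(\theta_{\mathcal{D}}, m_H, a_T) > u_{\mathcal{R}}(\theta_{\mathcal{D}}, m_H, a_N)$,
then $\mathcal{R}$ plays $a_T$ in response to both messages. We have assumed, however, that
$\mathcal{A}$ prefers to be trusted on $m_H$ compared to being trusted on $m_L$ (A4), so $\mathcal{A}$
deviates and this is not an equilibrium.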

A.2 Pooling Equilibria 

Next, we search for pooling equilibria of $G_S$. In pooling equilibria, $\mathcal{R}$ relies only
on the prior probabilities $p$ and $1 - p$ in order to form his belief about the type
of the cloud. The existence of pooling equilibria depends essentially on the trust
benefits $TB_H(p)$ and $TB_L(p)$.



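Pooling on $m_L$. If $TB_L(p) < 0$, then $\mathcal{R}$'s best response is $a_N$. This will only
be an equilibrium if his best response to $m_H$ would also be $a_N$. This is the case
only when the belief satisfies

$$\mu(\theta_{\mathcal{A}} \mid m_H)\, u_{\mathcal{R}}(\theta_{\mathcal{A}}, m_H, a_T) + \left(1 - \mu(\theta_{\mathcal{A}} \mid m_H)\right) u_{\mathcal{R}}(\theta_{\mathcal{D}}, m_H, a_T)$$
$$\le \mu(\theta_{\mathcal{A}} \mid m_H)\, u_{\mathcal{R}}(\theta_{\mathcal{A}}, m_H, a_N) + \left(1 - \mu(\theta_{\mathcal{A}} \mid m_H)\right) u_{\mathcal{R}}(\theta_{\mathcal{D}}, m_H, a_N). \qquad (34)$$

Moreover, this can only be an equilibrium when neither $\mathcal{A}$ nor $\mathcal{D}$ has an incentive
to deviate: i.e., when

$$u_{\mathcal{A}}(m_H, a_N) < u_{\mathcal{A}}(m_L, a_N) \quad \text{and} \quad u_{\mathcal{D}}(m_H, a_N) < u_{\mathcal{D}}(m_L, a_N). \qquad (35)$$

If these conditions are satisfied, then denote this equilibrium by Equilibrium #1.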
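
The belief condition in Equation (34) is easy to check numerically. The sketch below assumes that the receiver's utilities are supplied as a dictionary keyed by (type, message, action) and that the inequality is weak; the function names, key labels, and numeric payoffs are illustrative assumptions rather than values from the paper. The off-path belief $\mu(\theta_{\mathcal{A}} \mid m)$ is treated as a free parameter.

```python
def expected_utility(u_R, mu_attacker, message, action):
    """Receiver's expected utility for `action` after `message`, given the
    belief mu_attacker = mu(theta_A | message)."""
    return (mu_attacker * u_R[("theta_A", message, action)]
            + (1.0 - mu_attacker) * u_R[("theta_D", message, action)])


def rejects_message(u_R, mu_attacker, message):
    """True when not trusting (a_N) is a best response to `message`,
    i.e. trusting yields no more expected utility than not trusting."""
    if not 0.0 <= mu_attacker <= 1.0:
        raise ValueError("belief must be a probability")
    trust = expected_utility(u_R, mu_attacker, message, "a_T")
    no_trust = expected_utility(u_R, mu_attacker, message, "a_N")
    return trust <= no_trust


if __name__ == "__main__":
    # Hypothetical payoffs for the off-path message m_H.
    u_R = {
        ("theta_A", "m_H", "a_T"): -10.0,  # trusting an attacker is costly
        ("theta_A", "m_H", "a_N"): 0.0,
        ("theta_D", "m_H", "a_T"): 4.0,    # trusting the defender pays off
        ("theta_D", "m_H", "a_N"): 0.0,
    }
    for mu in (0.1, 0.5):
        print(mu, rejects_message(u_R, mu, "m_H"))
```

With these hypothetical payoffs, the receiver rejects $m_H$ only when the belief that the attacker sent it is large enough. The same function covers the analogous condition in Equation (36) by passing the message $m_L$ with the corresponding utilities.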

If $TB_L(p) > 0$, then $\mathcal{R}$'s best response is $a_T$. Whether this represents
an equilibrium depends on whether $\mathcal{A}$ or $\mathcal{D}$ has an incentive to deviate from $m_L$. If
$u_{\mathcal{A}}(m_H, a_T) < u_{\mathcal{A}}(m_L, a_T)$ and $u_{\mathcal{D}}(m_H, a_T) < u_{\mathcal{D}}(m_L, a_T)$, then neither has
an incentive to deviate. This is Equilibrium #5. If one of these inequalities does
not hold, then the player who prefers $m_H$ to $m_L$ will deviate if $\mathcal{R}$ would play
$a_T$ in response to the deviation. The equilibrium condition is narrowed to when
the belief makes $\mathcal{R}$ not trust $m_H$; i.e., when Equation (34) is satisfied. Call this
Equilibrium #3.


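Pooling on $m_H$. The pattern of equilibria for pooling on $m_H$ follows a similar
structure to the pattern of equilibria for pooling on $m_L$.

If $TB_H(p) < 0$, then $\mathcal{R}$'s best response is $a_N$. This will only be an equilibrium
if his best response to $m_L$ would also be $a_N$. This is the case only when the belief
satisfies

$$\mu(\theta_{\mathcal{A}} \mid m_L)\, u_{\mathcal{R}}(\theta_{\mathcal{A}}, m_L, a_T) + \left(1 - \mu(\theta_{\mathcal{A}} \mid m_L)\right) u_{\mathcal{R}}(\theta_{\mathcal{D}}, m_L, a_T)$$
$$\le \mu(\theta_{\mathcal{A}} \mid m_L)\, u_{\mathcal{R}}(\theta_{\mathcal{A}}, m_L, a_N) + \left(1 - \mu(\theta_{\mathcal{A}} \mid m_L)\right) u_{\mathcal{R}}(\theta_{\mathcal{D}}, m_L, a_N). \qquad (36)$$

To guarantee that $\mathcal{A}$ and $\mathcal{D}$ do not deviate, we require

$$u_{\mathcal{A}}(m_H, a_N) > u_{\mathcal{A}}(m_L, a_N) \quad \text{and} \quad u_{\mathcal{D}}(m_H, a_N) > u_{\mathcal{D}}(m_L, a_N). \qquad (37)$$

If these conditions are satisfied, then we have Equilibrium #6.

If $TB_H(p) > 0$, then $\mathcal{R}$’s best response is $a_T$. If $u_{\mathcal{D}}(m_H, a_T) > u_{\mathcal{D}}(m_L, a_T)$ and
$u_{\mathcal{A}}(m_H, a_T) > u_{\mathcal{A}}(m_L, a_T)$, then neither $\mathcal{A}$ nor $\mathcal{D}$ has an incentive to deviate.
Call this Equilibrium #8. If one of these inequalities does not hold, then the
belief must satisfy Equation (36) for an equilibrium to be sustained. Denote this
equilibrium by Equilibrium #7.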




